A Scalable Method for Analysis and Display of DNA Sequences
Abstract
BACKGROUND: Comparative DNA sequence analysis provides insight into evolution and helps construct a natural classification reflecting the Tree of Life. The growing number of organisms represented in DNA databases challenges tree-building techniques, and the vertical hierarchical classification may obscure relationships among some groups. Approaches that can incorporate sequence data from large numbers of taxa and enable visualization of affinities across groups are desirable.
METHODOLOGY/PRINCIPAL FINDINGS: Toward this end, we developed a procedure for extracting diagnostic patterns in the form of indicator vectors from DNA sequences of taxonomic groups. In the present instance, the indicator vectors were derived from mitochondrial cytochrome c oxidase I (COI) sequences of those groups and further analyzed on this basis. In the first example, indicator vectors for birds, fish, and butterflies were constructed from a training set of COI sequences; correlations were then determined with test sequences not used to construct the indicator vectors. In all cases, correlation with the indicator vector correctly assigned test sequences to their proper group. In the second example, this approach was explored at the species level within the bird grouping; this also gave correct assignment, suggesting the possibility of automated procedures for classification at various taxonomic levels. A false-color matrix of vector correlations displayed affinities among species consistent with higher-order taxonomy.
CONCLUSIONS/SIGNIFICANCE: The indicator vectors preserved DNA character information and provided quantitative measures of correlations among taxonomic groups. This method is scalable to the largest datasets envisioned in this field, provides a visually intuitive display that captures relational affinities derived from sequence data across a diversity of life forms, and is potentially a useful complement to current tree-building techniques for studying evolutionary processes based on DNA sequence data.
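The abstract does not spell out how the indicator vectors are computed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' procedure. It assumes pre-aligned sequences of equal length, encodes each base as a one-hot vector, and uses a simplified stand-in for an indicator vector: the group's mean encoding minus the mean over all groups. The sequences are made-up toy strings, not real COI data, and all function names are hypothetical.

```python
from math import sqrt

BASES = "ACGT"

def encode(seq):
    """One-hot encode an aligned DNA sequence into a flat list of floats.
    Characters other than A, C, G, T (gaps, N) become all zeros."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if undefined)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [x - mu for x in u]
    dv = [x - mv for x in v]
    den = sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return sum(a * b for a, b in zip(du, dv)) / den if den else 0.0

def build_indicators(training):
    """ASSUMPTION: indicator = group mean encoding minus the grand mean,
    so positions shared by every group carry no weight."""
    means = {g: mean_vector([encode(s) for s in seqs])
             for g, seqs in training.items()}
    grand = mean_vector(list(means.values()))
    return {g: [a - b for a, b in zip(m, grand)] for g, m in means.items()}

def classify(seq, indicators):
    """Assign a test sequence to the group whose indicator correlates best."""
    x = encode(seq)
    scores = {g: pearson(x, ind) for g, ind in indicators.items()}
    return max(scores, key=scores.get), scores

def correlation_matrix(indicators):
    """Pairwise correlations among indicators (the data behind a heat map)."""
    groups = sorted(indicators)
    matrix = [[pearson(indicators[a], indicators[b]) for b in groups]
              for a in groups]
    return groups, matrix

# Toy training set (invented sequences, for illustration only).
training = {
    "birds": ["ACGTACGTAC", "ACGTACGTAT"],
    "fish": ["TTGTACCTAC", "TTGTACCTAA"],
    "butterflies": ["ACCAAGGTGC", "ACCAAGGTGG"],
}
indicators = build_indicators(training)

if __name__ == "__main__":
    label, scores = classify("ACGTACGAAC", indicators)  # not in training set
    print(label, {g: round(s, 3) for g, s in scores.items()})
    groups, matrix = correlation_matrix(indicators)
    for g, row in zip(groups, matrix):
        print(g, [round(v, 2) for v in row])
```

The matrix returned by `correlation_matrix` is the kind of table that could be rendered as a false-color image with any plotting library. Because each group is summarized by a single vector, adding sequences changes the cost of building the vectors but not the cost of scoring a test sequence against them.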
Similar resources
Population structure and variation in Persian sturgeon (Acipenser persicus) from the Caspian Sea as determined from mitochondrial DNA sequences of the control region
Mitochondrial DNA (mtDNA) control region sequences were analyzed to evaluate the population genetic structure of Persian sturgeon (Acipenser persicus) in the Caspian Sea. A total of 45 specimens were collected from different locations in the Caspian Sea. The mtDNA control region was amplified by PCR. Direct sequencing was performed according to a standard method. The results showed that 12 haplotypes...
An Effective Method for Detecting Y-chromosome Specific Sequences of Circulating Fetal DNA in Maternal Plasma During the First-trimester
Background and Aims: New advances in the use of cell-free fetal DNA (cffDNA) in the maternal plasma of pregnant women have made it possible to apply cffDNA in prenatal diagnosis as a non-invasive method. One of the applications of prenatal diagnosis is fetal gender determination. Early prenatal determination of fetal sex is required for pregnant women at risk of X-linked and some endocrin...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
A comparative phylogenetic analysis of Theileria spp. by using two "18S ribosomal RNA" and "Theileria annulata merozoite surface antigen" gene sequences
More than 185 species, strains and unclassified Theileria parasites are categorized in the Entrez Taxonomy. The accurate diagnosis and proper identification of the causative agents are important for understanding the epidemiology, prevention and appropriate treatment. This study aims to discuss the importance of two genes of Theileria annulata 18S ribosomal RNA (18S rRNA) and Theileria annulata...
Searching the genome of beluga (Huso huso) for sex markers based on targeted Bulked Segregant Analysis (BSA)
In sturgeon aquaculture, where the main purpose is caviar production, a reliable method is needed to separate fish according to gender. Currently, due to the lack of external sexual dimorphism, the fish are sexed by an invasive surgical examination of the gonads. Development of a non-invasive procedure for sexing fish based on genetic markers is of special interest. In the present study we empl...